Approximation Guarantees for Max Sum and Max Min Facility Dispersion with Parameterised Triangle Inequality and Applications in Result Diversification

Author

  • Marcin Sydow
Abstract

The Facility Dispersion Problem, originally studied in Operations Research, has recently found important new applications in the Result Diversification approach in information sciences. This optimisation problem consists in selecting a small set of p items out of a large set of candidates so as to maximise a given objective function. The function expresses the notion of dispersion of the set of selected items in terms of a pairwise distance measure between items. In most known formulations the problem is NP-hard, but 2-approximation algorithms exist for some cases if the distance satisfies the triangle inequality. We present generalised 2/α approximation guarantees for the Facility Dispersion Problem in its two most common variants, Max Sum and Max Min, when the underlying dissimilarity measure satisfies a parameterised triangle inequality with parameter α. The results apply to both relaxed and stronger variants of the triangle inequality. We also demonstrate potential applications of our findings in the result diversification problem, including web search and entity summarisation in semantic knowledge graphs, as well as in practical computations on finite data sets.
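
To make the Max Min variant and the role of α on a finite data set concrete, here is a minimal Python sketch. It rests on two assumptions that the abstract does not state: the parameterised triangle inequality is taken to have the form d(u, v) + d(v, w) ≥ α · d(u, w), and the selection routine is a standard farthest-point greedy heuristic, not necessarily the algorithm analysed in the paper. The function names `max_min_greedy` and `largest_alpha` are hypothetical, and items are assumed to be distinct.

```python
from itertools import combinations, permutations


def max_min_greedy(items, dist, p):
    """Greedily pick p items, aiming to maximise the minimum pairwise distance.

    Standard farthest-point heuristic (an assumption, not taken from the paper).
    """
    if p < 2 or p > len(items):
        raise ValueError("p must satisfy 2 <= p <= len(items)")
    # Seed the selection with the farthest pair of items.
    first, second = max(combinations(items, 2), key=lambda pair: dist(*pair))
    selected = [first, second]
    remaining = [x for x in items if x not in selected]
    while len(selected) < p:
        # Add the candidate whose nearest selected item is farthest away.
        best = max(remaining, key=lambda x: min(dist(x, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected


def largest_alpha(items, dist):
    """Largest alpha such that dist(u, v) + dist(v, w) >= alpha * dist(u, w)
    holds for every triple of distinct items (assumed form of the inequality).
    """
    ratios = [
        (dist(u, v) + dist(v, w)) / dist(u, w)
        for u, v, w in permutations(items, 3)
        if dist(u, w) > 0
    ]
    return min(ratios) if ratios else None


# Hypothetical toy data: four points on a line with absolute difference as distance.
points = [0.0, 1.0, 2.0, 10.0]
line_dist = lambda a, b: abs(a - b)

chosen = max_min_greedy(points, line_dist, 3)
alpha = largest_alpha(points, line_dist)
print(chosen, alpha)
```

For points on a line the ordinary triangle inequality is tight, so the computed α is 1 and the 2/α guarantee quoted in the abstract reduces to the familiar factor of 2. Under the assumed form of the inequality, a data set whose α exceeds 1 would yield a correspondingly tighter bound.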

Similar resources

Max-Sum Diversification, Monotone Submodular Functions and Dynamic Updates

Result diversification is an important aspect in web-based search, document summarization, facility location, portfolio management and other applications. Given a set of ranked results for a set of objects (e.g. web documents, facilities, etc.) with a distance between any pair, the goal is to select a subset S satisfying the following three criteria: (a) the subset S satisfies some constraint (...

Max-Sum Diversity Via Convex Programming

Diversity maximization is an important concept in information retrieval, computational geometry and operations research. Usually, it is a variant of the following problem: Given a ground set, constraints, and a function f(·) that measures diversity of a subset, the task is to select a feasible subset S such that f(S) is maximized. The sum-dispersion function f(S) = ∑ x,y∈S d(x, y), which is the...

Deterministic Algorithms for Multi-criteria TSP

We present deterministic approximation algorithms for the multi-criteria traveling salesman problem (TSP). Our algorithms are faster and simpler than the existing randomized algorithms. First, we devise algorithms for the symmetric and asymmetric multicriteria Max-TSP that achieve ratios of 1/2k − ε and 1/(4k − 2) − ε, respectively, where k is the number of objective functions. For two objectiv...

On Approximating Multi-Criteria TSP

We present approximation algorithms for almost all variants of the multi-criteria traveling salesman problem (TSP). First, we devise randomized approximation algorithms for multi-criteria maximum traveling salesman problems (Max-TSP). For multi-criteria Max-STSP, where the edge weights have to be symmetric, we devise an algorithm with an approximation ratio of 2/3 − ε. For multi-criteria Max-AT...

Max-Sum Diversification, Monotone Submodular Functions and Semi-metric Spaces

In many applications such as web-based search, document summarization, facility location and other applications, the results are preferable to be both representative and diversified subsets of documents. The goal of this study is to select a good “quality”, bounded-size subset of a given set of items, while maintaining their diversity relative to a semimetric distance function. This problem was...

Publication date: 2014